feat(eks): endpoint access customization #9095
Conversation
Title does not follow the guidelines of Conventional Commits. Please adjust title before merge.
providerProps = {
  ...providerProps,
  vpc: this.vpc,
  vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE },
Just curious - does that mean it will just pick up all the private subnets, and we would not be able to specify which subnets?
The reason for asking is that in our context, one VPC will have 10+ private subnets across 3 AZs. So if we don't have a way to specify the subnets (e.g. as a comma-separated string), the corresponding Custom Resource Handler Lambda might try to bind to all of those subnets, which would fail in our context because a Lambda cannot associate with multiple subnets in one AZ.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JasperW01 Good call out.
We actually already accept a subnet selection for the VPC:
aws-cdk/packages/@aws-cdk/aws-eks/lib/cluster.ts
Lines 130 to 144 in e8c0b58
 * Where to place EKS Control Plane ENIs
 *
 * If you want to create public load balancers, this must include public subnets.
 *
 * For example, to only select private subnets, supply the following:
 *
 * ```ts
 * vpcSubnets: [
 *   { subnetType: ec2.SubnetType.Private }
 * ]
 * ```
 *
 * @default - All public and private subnets
 */
readonly vpcSubnets?: ec2.SubnetSelection[];
So I guess this should be:
vpcSubnets: props.vpcSubnets || { subnetType: ec2.SubnetType.PRIVATE },
Hi @iliapolo, good pick-up.
I guess it will do the trick. On the other hand, it assumes the user selected only one subnet per AZ as the input for cluster creation, which may or may not be the case. If not, the CR handler Lambda's association with the subnets will still fail because of more than one subnet per AZ.
So I can think of two choices:
Option i. Use "vpcSubnets: props.vpcSubnets || { subnetType: ec2.SubnetType.PRIVATE }" as the fix here, but specifically document that when creating the cluster you should choose only one subnet per AZ for the control plane; you can still add the data plane to other subnets in the same VPC later, as stated in the section "Sidebar: Do my worker nodes need to be in the same subnets as the ones I provided when I started the cluster?" of https://aws.amazon.com/blogs/containers/de-mystifying-cluster-networking-for-amazon-eks-worker-nodes/.
(BTW, as a side point - for customers with complex subnet layouts in a VPC, the key point in that blog is quite important but not stated clearly in the official EKS documentation; there is only a general link to the blog from https://docs.aws.amazon.com/eks/latest/userguide/create-public-private-vpc.html. It might be worth calling it out specifically in the cluster-creation sections of the official EKS docs, which would be very useful for customers who, rightly or wrongly, have many subnets in a VPC.)
Option ii. Add a specific option here to let the customer specify the subnets for the Custom Resource Handler Lambda to associate with, so the customer has the chance to ensure only one subnet per AZ.
There might be other choices. Weighing simplicity and consistency against flexibility, my personal opinion leans towards option i. But deciding which one is better also needs to consider other factors of the aws-eks module holistically, so I could be totally wrong to prefer option i.
Anyway just my 2 cents for your reference.
@JasperW01 Thanks again for the input. Let me ponder this a bit 👍
Hi @JasperW01 - So you mentioned earlier:
Handler Lambda might try to bind to all of those subnets, which would fail in our context because a Lambda cannot associate with multiple subnets in one AZ
Are you sure about this? I'm asking because I ran this scenario and saw that it does actually work. For example, this is my Lambda's VPC config:
Notice that there are 3 subnets in each AZ, and the Lambda binds to all of them. Am I missing something? Is there documentation somewhere about this multiple-subnets limitation?
Hi @iliapolo, this is interesting and challenging my memory. I remember recently, when I was provisioning a Lambda with a VPC association, it failed with an error along the lines of "failed to associate multiple subnets per AZ". I will look back at what happened and get back to you later today.
If this Lambda subnet-binding concern turns out to be a false alarm, then I guess "vpcSubnets: props.vpcSubnets || { subnetType: ec2.SubnetType.PRIVATE }," would be all good. In our context, we have around 20 subnets segregated into multiple management zones, so we need to specify the selected subnets via "readonly vpcSubnets?: ec2.SubnetSelection[];" when creating the EKS cluster, and it would be nice to align the handler Lambda's subnets with the EKS control plane's subnets where the control-plane ENIs will sit.
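As an illustration of that kind of alignment, here is a minimal sketch of hand-picking one control-plane subnet per AZ through the existing vpcSubnets prop. The stack layout, subnet ids, AZs and Kubernetes version are all placeholders, not anything taken from this thread:

```ts
import * as ec2 from '@aws-cdk/aws-ec2';
import * as eks from '@aws-cdk/aws-eks';
import * as cdk from '@aws-cdk/core';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'EksSubnetSelectionStack', {
  // fromLookup requires a concrete account/region.
  env: { account: process.env.CDK_DEFAULT_ACCOUNT, region: process.env.CDK_DEFAULT_REGION },
});

// Existing shared VPC with many subnets per AZ.
const vpc = ec2.Vpc.fromLookup(stack, 'Vpc', { vpcId: 'vpc-0123456789abcdef0' });

// Hand-pick exactly one subnet per AZ for the control plane ENIs.
const controlPlaneSubnets: ec2.ISubnet[] = [
  ec2.Subnet.fromSubnetAttributes(stack, 'SubnetA', { subnetId: 'subnet-aaaa', availabilityZone: 'ap-southeast-2a' }),
  ec2.Subnet.fromSubnetAttributes(stack, 'SubnetB', { subnetId: 'subnet-bbbb', availabilityZone: 'ap-southeast-2b' }),
  ec2.Subnet.fromSubnetAttributes(stack, 'SubnetC', { subnetId: 'subnet-cccc', availabilityZone: 'ap-southeast-2c' }),
];

new eks.Cluster(stack, 'Cluster', {
  vpc,
  vpcSubnets: [{ subnets: controlPlaneSubnets }],
  version: eks.KubernetesVersion.V1_16,
});
```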
Hi @iliapolo, I dug a bit deeper on my side and also double-checked with my teammate. Both of us clearly remember encountering such an error earlier this year while using CDK to provision a Lambda with a VPC association. The funny thing is that we could not replicate the exact error today, so we suspect it might have been a CDK-related issue that has since been fixed.
On the other hand, today we did hit another issue when blanket-using ec2.SubnetType.PRIVATE to create a Lambda in our CDK code in our environment: it exceeds the 16-subnet limit, as shown below.
So I guess it still makes sense to align the handler Lambda's subnet association with the EKS control plane subnet selection.
Hi @iliapolo, I just realized I forgot one thing apart from the subnet selection we already discussed above. In an enterprise context like ours, the customer does not use the native AWS VPC NAT Gateway to reach the internet. Instead, workloads hosted in the VPC have to go through the customer's enterprise internet proxy. So for Lambdas associated with the customer VPC, we have to set the "https_proxy", "http_proxy" and "no_proxy" environment variables, which are automatically honoured by Python Lambdas, to let those Lambdas reach the internet. The same applies to the Custom Resource (CR) handler Lambda (Python) of the "onEvent handler for EKS kubectl resource provider", which needs internet access to retrieve Kubernetes manifest files and Helm charts. Hence, is it possible to add an optional props item that lets the customer specify a set of environment variables for that CR handler Lambda? Thanks a lot in advance.
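For illustration, a minimal sketch of how such an option might be consumed from the cluster props, using the kubectlEnvironment name that this thread and the PR description later refer to; the proxy endpoints and no_proxy list are placeholders:

```ts
import * as ec2 from '@aws-cdk/aws-ec2';
import * as eks from '@aws-cdk/aws-eks';
import * as cdk from '@aws-cdk/core';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'ProxiedEksStack');

const vpc = new ec2.Vpc(stack, 'Vpc');

new eks.Cluster(stack, 'Cluster', {
  vpc,
  version: eks.KubernetesVersion.V1_16,
  // Environment variables picked up by the kubectl/helm handler Lambda so that
  // chart downloads go through the corporate proxy instead of a NAT gateway.
  kubectlEnvironment: {
    https_proxy: 'http://proxy.example.corp:8080',
    http_proxy: 'http://proxy.example.corp:8080',
    no_proxy: '169.254.169.254,.eks.amazonaws.com,10.0.0.0/8',
  },
});
```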
@JasperW01 Thanks, yeah we should take this into account. I'm not keen on exposing these handler properties to users, since they are somewhat of an implementation detail, but it seems like the practical approach here. @eladb Thoughts? @JasperW01 You also mentioned:
needs internet access to retrieve Kubernetes manifest files and Helm charts
From what I can tell, the handler only goes out to the internet for Helm charts; manifests are embedded in the CFN template. Am I missing something? P.S.: Just out of curiosity, is there any way you can share the VPC configuration you'll be passing to the cluster? (Redacted, of course.)
Hi @iliapolo, thanks a lot for the swift follow-up. Q1: "From what I can tell, the handler only goes out to the internet for Helm charts; manifests are embedded in the CFN template. Am I missing something?"
Q2: "Just out of curiosity, is there any way you can share the VPC configuration you'll be passing to the cluster?"
Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).
Hi @iliapolo, I just tried the new private-endpoint support in 1.57.0. I used the following lines to create a private cluster, but it keeps failing with the error "The following resource(s) failed to create: [Handler886CB40B]". A screenshot of the nested stack error is captured as well. I suspect the reason is that the VPC config for the handler Lambda in the nested stack is not correct. Could you please have a look? Thanks a lot indeed. Cheers.
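The code lines themselves aren't shown above; as a rough sketch of what the failing setup presumably looked like, based on details mentioned elsewhere in this thread (a VPC imported via ec2.Vpc.fromLookup, a Fargate-only cluster, EndpointAccess.PRIVATE), with the VPC id and Kubernetes version as placeholders:

```ts
import * as ec2 from '@aws-cdk/aws-ec2';
import * as eks from '@aws-cdk/aws-eks';
import * as cdk from '@aws-cdk/core';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'PrivateEksStack', {
  env: { account: process.env.CDK_DEFAULT_ACCOUNT, region: process.env.CDK_DEFAULT_REGION },
});

// Existing VPC with 2 public and 2 private subnets, imported by id.
const vpc = ec2.Vpc.fromLookup(stack, 'Vpc', { vpcId: 'vpc-0123456789abcdef0' });

new eks.FargateCluster(stack, 'FargateCluster', {
  vpc,
  version: eks.KubernetesVersion.V1_16,
  // Commenting out this line (falling back to the public + private default)
  // is what made the deployment succeed again.
  endpointAccess: eks.EndpointAccess.PRIVATE,
});
```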
Hi @iliapolo, just a quick addition - the VPC is imported via ec2.Vpc.fromLookup. Also, after I comment out the eks.EndpointAccess.PRIVATE line above, the creation is successful: the cluster's Kubernetes endpoint is private + public as expected, and inside the nested stack template the handler Lambda has the kubectlEnvironment setting but no VpcConfig setting (probably why that nested stack creation succeeds).
I tried eks.Cluster as well, and got the same error.
Hi @JasperW01 - I'm looking into it. Could you please attach the synthesized CloudFormation resource for the handler? Also, just making sure: your VPC has private subnets in it, right?
Hi @iliapolo, thanks a lot for the swift response. Yes, the VPC I used to test has 2 private and 2 public subnets. Please find the synthesized stacks, which include one for a Fargate-only cluster and one for a no-default-capacity cluster; both are synthesized from the CDK code quoted above. As you can see, in both nested stacks containing the kubectl handler Lambda, the VpcConfig setting is as shown below, which caused the failure: the security group ID is populated but the subnet IDs list is empty. I guess the issue can be replicated on any other dev laptop.
@JasperW01 Interesting. Could you also share the code you use to instantiate the VPC - you mentioned it's imported via ec2.Vpc.fromLookup?
Hi @iliapolo, the VPC in my account is not generated by CDK code; I created it beforehand using the AWS VPC console (i.e. the VPC creation wizard for a VPC with 2 public and 2 private subnets). The reason is that in most enterprise environments the app team is not allowed to create a VPC with their CDK/CFN stack; rather, they have to use the existing VPC created by the foundation team and select the proper existing subnets in their CDK. Hence I am trying to mimic the same in my own account while testing the CDK packages.
BTW, just curious: do you mean that if I choose to create the VPC together with the EKS cluster via the same CDK app, the issue won't occur?
@JasperW01 I understand. I was interested in the code you use inside your CDK app to instantiate (not actually create) the Vpc construct, i.e.: const vpc = Vpc.fromLookup()...
It should work with your scenario as well; my suspicion right now is that for some reason the subnets aren't selected properly for looked-up VPCs. If you can also try creating a VPC in your CDK stack, that would be good, just as a validation.
Oh, sorry, I misunderstood your question. Please find the full code quoted below.
export interface EksSelfCdkStackProps extends cdk.StageProps { /* ... */ }
export class EksSelfCdkStack extends cdk.Stack { /* ... */ }
Will try that as a validation, but fundamentally we need to use an existing VPC. BTW, I assume you are using tags to select the subnets - here are the tags of my subnets in the VPC. Importantly, it is shared by two EKS clusters, i.e. there are two kubernetes.io/cluster/* = "shared" tags. Not sure if that's relevant.
@JasperW01 Are these subnets created externally from the CDK using the console, or are these the test subnets you are now validating with in your account? I'm a little confused because the tags look like ones a CDK/CloudFormation deployment would have added.
Ah, sorry for the confusion, mate. The VPC I selected was created by CDK several months ago when I played with the AWS modern application workshop - https://github.com/aws-samples/aws-modern-application-workshop/tree/python-cdk. It was NOT created via the AWS console as I said previously.
Hi @iliapolo, BTW, do you need me to also upload the cdk.context.json file, which has the VPC information imported by ec2.Vpc.fromLookup?
@JasperW01 sure that would be good as well 👍
@iliapolo here it is. BTW, I tried removing the cdk.context.json, changing to another vpcId as the input for ec2.Vpc.fromLookup, and synthesizing again. The exported nested stack for the handler Lambda still has the same problem with VpcConfig, i.e. the security group IDs are populated but the subnet IDs list is empty. So I guess you should be able to replicate the issue on your side by importing an existing VPC via ec2.Vpc.fromLookup and using it as the vpc input for eks.Cluster(). On the other hand, I have raised a request to increase my VPC quota (all 5 are used up at the moment), so I can validate the scenario of creating the VPC along with the EKS cluster in new eks.Cluster().
@JasperW01 I found the issue, will get a PR out early this week. You can track it here: #9542. Thanks for reporting!
@iliapolo :: The EKS documentation states that it is possible to create a cluster without internet connectivity; however, if I try to deploy a private cluster on a VPC that only has 'isolated' subnets I receive the following error:
... why am I being forced to deploy a NAT gateway (and friends) for a 100% private (isolated) cluster? Should I file an issue for this?
For those facing the same issue: it was a programming error on my side. I was constructing the VPC like this
... and the cluster like this:
The moment I changed that, it worked.
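The before/after snippets aren't shown above, so purely as a guess at the kind of change being described, here is a sketch of a fully isolated setup where the VPC has no NAT gateways and the cluster explicitly selects the isolated subnets. All names and values are assumptions, not the commenter's actual code:

```ts
import * as ec2 from '@aws-cdk/aws-ec2';
import * as eks from '@aws-cdk/aws-eks';
import * as cdk from '@aws-cdk/core';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'IsolatedEksStack');

// No NAT gateways and only isolated subnets: nothing in this VPC reaches the internet.
const vpc = new ec2.Vpc(stack, 'Vpc', {
  natGateways: 0,
  subnetConfiguration: [
    { name: 'isolated', subnetType: ec2.SubnetType.ISOLATED, cidrMask: 24 },
  ],
});

new eks.Cluster(stack, 'Cluster', {
  vpc,
  // Select the isolated subnets explicitly instead of relying on the default
  // selection, which expects NAT-routed private (or public) subnets.
  vpcSubnets: [{ subnetType: ec2.SubnetType.ISOLATED }],
  endpointAccess: eks.EndpointAccess.PRIVATE,
  version: eks.KubernetesVersion.V1_16,
});
```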
Add an option to configure endpoint access to the cluster control plane.
Resolves #5220
In addition, there is now a way to pass environment variables into the kubectl handler. This is necessary to allow private VPCs (with no internet access) to use an organizational proxy when installing Helm charts and, in general, when they need to access the internet. See #9095 (comment).
BREAKING CHANGE: endpoint access is configured to private and public by default instead of just public
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license